ABOUT VARIABLES AND DATA

An important aspect of statistical data analysis and visualization involves creating new variables, calculating new variables from existing ones, and modifying the values of existing variables. Such new variables become the basis of new data objects, which in turn can be used for data analysis and visualization.

This help file presents an overview of variables and data objects. It covers ViSta's lists of variable and data object names; the difference between short-names and long-names; how you use the lists of names to refer to variables and data objects; how you create new variables and data objects; and variable types and data types.

ABOUT SYMBOLS

You manipulate variables by typing simple arithmetic or algebraic statements in the listener window. These statements involve symbols for the mathematical operations and symbols for the variables. The mathematical operation symbols are the familiar ones you have always used: + for addition, / for division, etc. The variable symbols are the variable names you supplied when you created the variables.

NAMES - SHORT AND LONG 

Throughout ViSta, variables and data objects are referred to by names that you supply. However, there is always the possibility that a name might be repeated. If it is, the name does not unambiguously identify the variable or data object. To avoid this problem, ViSta constructs and uses "long-names" which are unambiguous. In contrast, the "short-names" are the names you supplied, which can be ambiguous.

DATA-OBJECT NAMES

Data objects have long names which consist of three parts.
   1) The first part is the name you supply.
   2) The second part is a three-character "extension".
   3) The third part is the "version number".

The extension identifies the datatype of the data object. The version number is supplied by ViSta and is used to uniquely identify the data object. The number is 1 if the name is unique, 2 if there is already another data object with the same name, etc. 

The long name has the syntax:
   dataname.ext#n
That is, the long name of a data object consists of the NAME you supplied, followed by a dot, then the EXTENSION, followed by # (number sign) and the VERSION.
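
The naming rule above can be illustrated with a small sketch. It is written in Python purely for illustration and is not ViSta code: the DataObjectRegistry class and its register method are hypothetical, and the sketch assumes that version numbers are counted per supplied name, as the text describes.

```python
from collections import defaultdict


class DataObjectRegistry:
    """Toy model (hypothetical, not part of ViSta) of long-name versioning."""

    def __init__(self):
        # Number of data objects created so far under each supplied name.
        self._counts = defaultdict(int)

    def register(self, name, ext):
        """Return the long name 'name.ext#n' for a newly created data object."""
        if len(ext) != 3:
            raise ValueError("extension must have exactly three characters")
        self._counts[name] += 1
        return f"{name}.{ext}#{self._counts[name]}"


registry = DataObjectRegistry()
first = registry.register("MYDATA", "GEN")
second = registry.register("MYDATA", "GEN")
print(first)   # MYDATA.GEN#1
print(second)  # MYDATA.GEN#2
```

The second call reuses the name MYDATA, so the sketch assigns version 2, which keeps the two long names distinct even though the short names are identical.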


VARIABLE NAMES

The long-name of a variable is constructed from the name you gave the variable and the name of the data object that contains the variable. Specifically, a variable's long-name is the long name of the data object (including extension and version number) followed by a dot and the name you gave to the variable:
   dataname.ext#n.varname
For example, if the data object MYDATA contains a variable named GPA, then the variable's long-name is MYDATA.GEN#1.GPA, provided that no other data object is named MYDATA and that the data object's datatype is "general".

Variables do not have to be in data objects, but may be "free-standing" variables. When you first create and manipulate new variables they are free-standing, and are called "free" variables. Variables in data objects are called "attached" variables. Free-standing variables have long-names which are the same as their given names, since they are not in any data object. Note that this means two free variables can have the same name.
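
As an illustration of the two cases, the following Python sketch builds a variable's long-name for an attached variable and for a free variable. The function variable_long_name is hypothetical and exists only to restate the rule; it is not a ViSta function.

```python
def variable_long_name(varname, data_long_name=None):
    """Build a variable long-name following the rule described above.

    data_long_name is the long name of the containing data object
    (for example 'MYDATA.GEN#1'), or None for a free variable.
    """
    if data_long_name is None:
        # Free variable: the long-name is simply the given name.
        return varname
    # Attached variable: data object long name, a dot, then the given name.
    return f"{data_long_name}.{varname}"


attached = variable_long_name("GPA", "MYDATA.GEN#1")
free = variable_long_name("GPA")
print(attached)  # MYDATA.GEN#1.GPA
print(free)      # GPA
```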

There is always a "current" data object. This is the data object that is the current focus of data analysis and visualization. It is represented on the workmap by the highlighted data icon.


REFERRING TO VARIABLES

In order to manipulate a variable, you must be able to specify which variable you wish to manipulate. A variable or data object can be referred to by either its short-name or its long-name. If you use a short-name, the most recently created variable with that name will be the one referred to.
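
The "most recently created" rule can be sketched as follows. This is a Python model of the behavior described above, not ViSta's implementation; the VariableTable class is hypothetical, and it assumes that a variable's short-name is the part of its long-name after the last dot.

```python
class VariableTable:
    """Toy model (hypothetical) of resolving short-names and long-names."""

    def __init__(self):
        self._long_names = []  # long-names in order of creation

    def create(self, long_name):
        self._long_names.append(long_name)

    def resolve(self, name):
        """Return the long-name that a short-name or long-name refers to."""
        # Search from newest to oldest so a short-name picks the latest match.
        for long_name in reversed(self._long_names):
            short_name = long_name.rsplit(".", 1)[-1]
            if name == long_name or name == short_name:
                return long_name
        raise KeyError(name)


table = VariableTable()
table.create("MYDATA.GEN#1.GPA")
table.create("MYDATA.GEN#2.GPA")
latest = table.resolve("GPA")
older = table.resolve("MYDATA.GEN#1.GPA")
print(latest)  # MYDATA.GEN#2.GPA
print(older)   # MYDATA.GEN#1.GPA
```

The short-name GPA reaches only the newer variable, so the older one must be referred to by its long-name.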

ViSta maintains information about the data and variable objects that have been defined during the session. In addition to referring to a data object by its name (with or without the #n suffix), you can refer to this information by various symbols, all beginning with $. The symbols include:

   SYMBOL      INFORMATION
   $           the current data object
   $data       all data objects 
   $vars       the variables in the current data
   $all-vars   all variables, free and attached
   $data-vars  all attached variables
   $free-vars  all free variables 
   $NAME-vars  all variables in data object NAME
               (NAME must be a long data object name)

By typing one of these symbols in the listener you can see which data and variable objects have been defined. New variables are usually constructed from specific variables on these lists. In addition, new variables may involve calculations performed on variables on these lists. 
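
The relationships among the variable lists can be pictured with a short Python sketch. It is only a model of the table above: the variable names SAT and ZSCORE are made up, the Python names stand in for the $ symbols, and the order in which ViSta actually lists variables is not specified here.

```python
# Each entry is (long name of containing data object or None, variable name).
variables = [
    ("MYDATA.GEN#1", "GPA"),
    ("MYDATA.GEN#1", "SAT"),   # hypothetical variable
    (None, "ZSCORE"),          # hypothetical free variable
]
current_data = "MYDATA.GEN#1"  # plays the role of $


def long_name(data, var):
    return var if data is None else f"{data}.{var}"


free_vars = [long_name(d, v) for d, v in variables if d is None]         # $free-vars
data_vars = [long_name(d, v) for d, v in variables if d is not None]     # $data-vars
all_vars = data_vars + free_vars                                         # $all-vars
current_vars = [long_name(d, v) for d, v in variables if d == current_data]  # $vars

print(free_vars)     # ['ZSCORE']
print(current_vars)  # ['MYDATA.GEN#1.GPA', 'MYDATA.GEN#1.SAT']
```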

NOTE: Due to fundamental differences in the nature of data objects whose datatype is MATRIX (extension .mat), these data objects are not included in the lists above. Furthermore, the variables in data objects whose datatype is MATRIX are not included in the features discussed below.


CREATING VARIABLES WITH VIVA AND VAR

ViSta provides two ways of creating new variable objects, known as ViVa and VAR.

ViVa is ViSta's Variable language, and VAR is a function for creating variables. Each creates new variables, which can be further manipulated with ViVa and VAR or made into data objects using the DATASET function. The resulting data object can then be analyzed and visualized by ViSta.

ViVa calculates variables using statements like ordinary arithmetic or algebraic statements. VAR calculates new variables using expressions in the Lisp language. 

The variables created by ViVa and VAR are called "free" variables because they are not contained in any data object.

ViVa and VAR are described in the CREATING VARIABLES help item.  

THE DATASET FUNCTION

The "free" variable objects created by ViVa and VAR must be placed in a data object so that they can be analyzed and visualized. The DATASET function creates a new data object from a group of variables. It is described in the ABOUT MAKING DATA help file. 

TYPES OF VARIABLE AND DATA OBJECTS

Variables have a property known as their "variable type", a property which determines what kinds of calculations can be done on their values and what kinds of statistics and visualizations are appropriate. The types recognized by ViSta are "numeric", "ordinal" and "category". 

> Numeric variables are regular numbers, supporting ordinary arithmetic and the statistics and visualizations based on arithmetic. 
> Ordinal variables are those whose values only specify order, not number. Such variables are seldom used in ViSta. 
> Category variables are those whose values only specify category membership. These variables are used in appropriate places in ViSta for determining the kinds of statistics and visualizations that can be computed.

By default, ViSta assumes variables are numeric. The VAR function's :TYPES keyword allows you to specify non-numeric variables. ViVa variables are always numeric.



